Setup

First we need to load the packages we will use in this lesson. We will use tidyverse and here with which you are already familiar from the previous lesson. In addition, we need to load the sf package. sf stands for Simple Features which is a standard defined by the Open Geospatial Consortium for storing and accessing geospatial vector data. PostGIS uses the same standard; so those of you who used PostGIS, might find some resemblances with the functionality of the sf package. If you have not installed it yet, run install.packages("sf") first to install the sf package.

library(tidyverse)  # tools for wrangling, reshaping and visualizing data
library(here)       # managing paths
if (!"sf" %in% installed.packages()) install.packages("sf")
library(sf)         # work with spatial vector data

Open and plot shapefiles (20 + 10 minutes)

Import shapefiles

Let’s start by opening a shapefile. Most of you must be already familiar with shapefiles, a common file format to store spatial vector data used in GIS software. We will read a shapefile with the administrative boundary of Delft with the function st_read() from the sf package. Note that all functions from the sf package start with the standard prefix st_ which stands for Spatial Type. This is helpful in at least two ways: (1) it makes the interaction with or translation to/from software using the simple features standard like PostGIS easy, and (2) it allows for easy autocompletion of function names in RStudio.

boundary_Delft <- st_read(here("data", "delft-boundary.shp"))
## Reading layer `delft-boundary' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/delft-boundary.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 4.320218 ymin: 51.96632 xmax: 4.407911 ymax: 52.0326
## Geodetic CRS:  WGS 84

The st_read() function gives us a message with information about the file that was read in. Additionally, we can use other functions to examine the metadata of the file in more detail.

Shapefile Metadata & Attributes

The st_geometry_type() function gives us information about the geometry type, which in this case is POLYGON.

st_geometry_type(boundary_Delft)
## [1] POLYGON
## 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT ... TRIANGLE

The st_crs() function gives us the coordinate reference system used by the shapefile, which in this case is WGS84. This is not what we want, so we will use the st_transform() function to reproject the file to the right CRS. For the crs argument we can use the EPSG code of the CRS we want to use. For the Netherlands, we want to use the Amersfort / RD New projection, which has the EPSG code 28992.

st_crs(boundary_Delft)
## Coordinate Reference System:
##   User input: WGS 84 
##   wkt:
## GEOGCRS["WGS 84",
##     DATUM["World Geodetic System 1984",
##         ELLIPSOID["WGS 84",6378137,298.257223563,
##             LENGTHUNIT["metre",1]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["latitude",north,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["longitude",east,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]],
##     ID["EPSG",4326]]
st_bbox(boundary_Delft)
##      xmin      ymin      xmax      ymax 
##  4.320218 51.966316  4.407911 52.032599
boundary_Delft <- st_transform(boundary_Delft, 28992)
st_crs(boundary_Delft)
## Coordinate Reference System:
##   User input: EPSG:28992 
##   wkt:
## PROJCRS["Amersfoort / RD New",
##     BASEGEOGCRS["Amersfoort",
##         DATUM["Amersfoort",
##             ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4289]],
##     CONVERSION["RD New",
##         METHOD["Oblique Stereographic",
##             ID["EPSG",9809]],
##         PARAMETER["Latitude of natural origin",52.1561605555556,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",5.38763888888889,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9999079,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",155000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",463000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
##         BBOX[50.75,3.2,53.7,7.22]],
##     ID["EPSG",28992]]
st_bbox(boundary_Delft)
##      xmin      ymin      xmax      ymax 
##  81743.00 442446.21  87703.78 449847.95
boundary_Delft
## Simple feature collection with 1 feature and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 81743 ymin: 442446.2 xmax: 87703.78 ymax: 449848
## Projected CRS: Amersfoort / RD New
##   osm_id                       geometry
## 1 324269 POLYGON ((87703.78 442651, ...

Plot a shapefile

Now, let’s plot this shapefile. You are already familiar with the ggplot2 package from this morning. ggplot2 has special geom_ functions for spatial data. We will use the geom_sf() function for sf data.

ggplot(data = boundary_Delft) +
  geom_sf(size = 3, color = "black", fill = "cyan1") +
  ggtitle("Delft Administrative Boundary") +
  coord_sf(datum = st_crs(28992))  # this is needed to display the axes in meters

Challenge

Read in delft-streets.shp and delft-leisure.shp and call them lines_Delft and point_Delft respectively.

Answer the following questions:

  1. What type of R spatial object is created when you import each layer?
  2. What is the CRS and extent for each object?
  3. Do the files contain points, lines, or polygons?
  4. How many spatial objects are in each file?
lines_Delft <- st_read(here("data", "delft-streets.shp"))
## Reading layer `delft-streets' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/delft-streets.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 11244 features and 2 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: 81759.58 ymin: 441223.1 xmax: 89081.41 ymax: 449845.8
## Projected CRS: Amersfoort / RD New

For points data we use leisure data from OSM.

point_Delft <- st_read(here("data", "delft-leisure.shp"))
## Reading layer `delft-leisure' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/delft-leisure.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 298 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 81863.21 ymin: 442621.1 xmax: 87370.15 ymax: 449345.1
## Projected CRS: Amersfoort / RD New

We can check the type of data with the class() function from base R. lines_Delft, for instance, is an object of class "sf", which extends the "data.frame" class.

class(lines_Delft)
## [1] "sf"         "data.frame"

We see that point_Delft has the same class and structure.

class(point_Delft)
## [1] "sf"         "data.frame"

lines_Delft is in the correct projection.

st_crs(lines_Delft)
## Coordinate Reference System:
##   User input: Amersfoort / RD New 
##   wkt:
## PROJCRS["Amersfoort / RD New",
##     BASEGEOGCRS["Amersfoort",
##         DATUM["Amersfoort",
##             ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4289]],
##     CONVERSION["RD New",
##         METHOD["Oblique Stereographic",
##             ID["EPSG",9809]],
##         PARAMETER["Latitude of natural origin",52.1561605555556,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",5.38763888888889,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9999079,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",155000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",463000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
##         BBOX[50.75,3.2,53.7,7.22]],
##     ID["EPSG",28992]]

And so is point_Delft.

st_crs(point_Delft)
## Coordinate Reference System:
##   User input: Amersfoort / RD New 
##   wkt:
## PROJCRS["Amersfoort / RD New",
##     BASEGEOGCRS["Amersfoort",
##         DATUM["Amersfoort",
##             ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4289]],
##     CONVERSION["RD New",
##         METHOD["Oblique Stereographic",
##             ID["EPSG",9809]],
##         PARAMETER["Latitude of natural origin",52.1561605555556,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",5.38763888888889,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9999079,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",155000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",463000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
##         BBOX[50.75,3.2,53.7,7.22]],
##     ID["EPSG",28992]]

When looking at the bounding boxes with the st_bbox() function, we see the spatial extent of the two objects in a projected CRS using meters as units. lines_Delft() and pont_Delft have similar extents.

st_bbox(lines_Delft)
##      xmin      ymin      xmax      ymax 
##  81759.58 441223.13  89081.41 449845.81
st_bbox(point_Delft)
##      xmin      ymin      xmax      ymax 
##  81863.21 442621.15  87370.15 449345.08

Explore and plot by vector layer attributes (40 + 20 minutes)

Let’s have a look at the content of the loaded data, starting with lines_Delft. In essence, an "sf" object is a data.frame with a “sticky” geometry column and some extra metadata, like CRS and geometry type we examined earlier.

lines_Delft
## Simple feature collection with 11244 features and 2 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: 81759.58 ymin: 441223.1 xmax: 89081.41 ymax: 449845.8
## Projected CRS: Amersfoort / RD New
## First 10 features:
##     osm_id  highway                       geometry
## 1  4239535 cycleway LINESTRING (86399.68 448599...
## 2  4239536 cycleway LINESTRING (85493.66 448740...
## 3  4239537 cycleway LINESTRING (85493.66 448740...
## 4  4239620  footway LINESTRING (86299.01 448536...
## 5  4239621  footway LINESTRING (86307.35 448738...
## 6  4239674  footway LINESTRING (86299.01 448536...
## 7  4310407  service LINESTRING (84049.47 447778...
## 8  4310808    steps LINESTRING (84588.83 447828...
## 9  4348553  footway LINESTRING (84527.26 447861...
## 10 4348575  footway LINESTRING (84500.15 447255...

This means that we can manipulate them as data frames. For instance, we can look at the number of variables (columns in a data frame) with ncol().

ncol(lines_Delft)
## [1] 3

In the case of point_Delft those columns are "osm_id", "highway" and "geometry". We can check the names of the columns with the base R function names().

names(lines_Delft)
## [1] "osm_id"   "highway"  "geometry"

We can also preview the content of the object with the head() function.

head(lines_Delft)
## Simple feature collection with 6 features and 2 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: 85107.1 ymin: 448400.3 xmax: 86399.68 ymax: 449076.2
## Projected CRS: Amersfoort / RD New
##    osm_id  highway                       geometry
## 1 4239535 cycleway LINESTRING (86399.68 448599...
## 2 4239536 cycleway LINESTRING (85493.66 448740...
## 3 4239537 cycleway LINESTRING (85493.66 448740...
## 4 4239620  footway LINESTRING (86299.01 448536...
## 5 4239621  footway LINESTRING (86307.35 448738...
## 6 4239674  footway LINESTRING (86299.01 448536...

Challenge

Explore the attributes associated with the point_Delft and boundary_Delft spatial objects.

  1. How many attributes does each have?
  2. What types of leisure points do the points represent? Give three examples.
  3. Which of the following is NOT an attribute of the point_Delft data object?
  1. location B) leisure C) osm_id
ncol(point_Delft)
## [1] 3
ncol(boundary_Delft)
## [1] 2

Using the $ operator already introduced in the morning, we can examine the content of a single variable. We only see two values and NAs with the head() function, so we can [either] increase the number of rows we want to see[, use the function na.omit() to remove NAs completely or use the unique() function to only see the first occurrence of each distinct value (note that NA will still be included once). This is an example of how in R you can do things in many different ways.] Printing the object will also give us the first 10 rows.

head(point_Delft)
## Simple feature collection with 6 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 83839.59 ymin: 443827.4 xmax: 84967.67 ymax: 447475.5
## Projected CRS: Amersfoort / RD New
##      osm_id      leisure                  geometry
## 1 472312297 picnic_table POINT (84144.72 443827.4)
## 2 480470725       marina POINT (84967.67 446120.1)
## 3 484697679         <NA> POINT (83912.28 447431.8)
## 4 484697682         <NA> POINT (83895.43 447420.4)
## 5 484697691         <NA>   POINT (83839.59 447455)
## 6 484697814         <NA> POINT (83892.53 447475.5)
head(point_Delft, 10)
## Simple feature collection with 10 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 82485.72 ymin: 443827.4 xmax: 85385.25 ymax: 448341.3
## Projected CRS: Amersfoort / RD New
##        osm_id       leisure                  geometry
## 1   472312297  picnic_table POINT (84144.72 443827.4)
## 2   480470725        marina POINT (84967.67 446120.1)
## 3   484697679          <NA> POINT (83912.28 447431.8)
## 4   484697682          <NA> POINT (83895.43 447420.4)
## 5   484697691          <NA>   POINT (83839.59 447455)
## 6   484697814          <NA> POINT (83892.53 447475.5)
## 7   549139430        marina POINT (84479.99 446823.5)
## 8   603300994 sports_centre POINT (82485.72 445237.5)
## 9   883518959 sports_centre POINT (85385.25 448341.3)
## 10 1148515039    playground    POINT (84661.3 446818)
# head(na.omit(point_Delft$leisure))  # this is extra
# head(unique(point_Delft$leisure))   # this is extra
point_Delft
## Simple feature collection with 298 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 81863.21 ymin: 442621.1 xmax: 87370.15 ymax: 449345.1
## Projected CRS: Amersfoort / RD New
## First 10 features:
##        osm_id       leisure                  geometry
## 1   472312297  picnic_table POINT (84144.72 443827.4)
## 2   480470725        marina POINT (84967.67 446120.1)
## 3   484697679          <NA> POINT (83912.28 447431.8)
## 4   484697682          <NA> POINT (83895.43 447420.4)
## 5   484697691          <NA>   POINT (83839.59 447455)
## 6   484697814          <NA> POINT (83892.53 447475.5)
## 7   549139430        marina POINT (84479.99 446823.5)
## 8   603300994 sports_centre POINT (82485.72 445237.5)
## 9   883518959 sports_centre POINT (85385.25 448341.3)
## 10 1148515039    playground    POINT (84661.3 446818)
names(point_Delft)
## [1] "osm_id"   "leisure"  "geometry"

Explore values within one attribute

Using the $ operator, we can examine the content of a single field of our lines feature.

head(lines_Delft$highway, 10)
##  [1] "cycleway" "cycleway" "cycleway" "footway"  "footway"  "footway" 
##  [7] "service"  "steps"    "footway"  "footway"

To see only unique values of a categorical variable stored as a factor, we can use the levels() function. Remember, factors were introduced in the morning.

levels(factor(lines_Delft$highway))
##  [1] "bridleway"      "busway"         "construction"   "cycleway"      
##  [5] "footway"        "living_street"  "motorway"       "motorway_link" 
##  [9] "path"           "pedestrian"     "platform"       "primary"       
## [13] "primary_link"   "proposed"       "residential"    "secondary"     
## [17] "secondary_link" "service"        "services"       "steps"         
## [21] "tertiary"       "tertiary_link"  "track"          "trunk"         
## [25] "trunk_link"     "unclassified"

Subset features

We can use the filter() function to select a subset of features from a spatial object, just like with data frames. Let’s select only cycleways from our street data.

cycleway_Delft <- lines_Delft %>% 
  filter(highway == "cycleway")

Our subsetting operation reduces the number of features from 11244 to 1397.

nrow(lines_Delft)
## [1] 11244
nrow(cycleway_Delft)
## [1] 1397

We can also calculate the total length of cycleways.

cycleway_Delft <- cycleway_Delft %>% 
  mutate(length = st_length(.))

cycleway_Delft %>%
  summarise(total_length = sum(length))
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTILINESTRING
## Dimension:     XY
## Bounding box:  xmin: 81759.58 ymin: 441227.3 xmax: 87326.76 ymax: 449834.5
## Projected CRS: Amersfoort / RD New
##   total_length                       geometry
## 1 115550.1 [m] MULTILINESTRING ((86399.68 ...

Now we can plot only the cycleways.

ggplot(data = cycleway_Delft) +
  geom_sf() +
  ggtitle("Slow mobility network in Delft", subtitle = "Cycleways") +
  coord_sf(datum = st_crs(28992))

Challenge

  1. Create a new object that only contains the motorways in Delft.
  2. How many features does the new object have?
  3. What is the total length of motorways?
  4. Plot the motorways.
  5. Extra: follow the same steps with pedestrian streets.
levels(factor(lines_Delft$highway))
##  [1] "bridleway"      "busway"         "construction"   "cycleway"      
##  [5] "footway"        "living_street"  "motorway"       "motorway_link" 
##  [9] "path"           "pedestrian"     "platform"       "primary"       
## [13] "primary_link"   "proposed"       "residential"    "secondary"     
## [17] "secondary_link" "service"        "services"       "steps"         
## [21] "tertiary"       "tertiary_link"  "track"          "trunk"         
## [25] "trunk_link"     "unclassified"
motorway_Delft <- lines_Delft %>% 
  filter(highway == "motorway")

motorway_Delft %>% 
  mutate(length = st_length(.)) %>% 
  select(everything(), geometry) %>%
  summarise(total_length = sum(length))
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTILINESTRING
## Dimension:     XY
## Bounding box:  xmin: 84501.66 ymin: 442458.2 xmax: 87401.87 ymax: 449205.9
## Projected CRS: Amersfoort / RD New
##   total_length                       geometry
## 1 14877.44 [m] MULTILINESTRING ((87395.68 ...
nrow(motorway_Delft)
## [1] 48
ggplot(data = motorway_Delft) + 
  geom_sf(size = 1.5) +
  ggtitle("Fast mobility network", subtitle = "Motorways") + 
  coord_sf()

pedestrian_Delft <- lines_Delft %>% 
  filter(highway == "pedestrian")

pedestrian_Delft %>% 
  mutate(length = st_length(.)) %>% 
  select(everything(), geometry) %>%
  summarise(total_length = sum(length))
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTILINESTRING
## Dimension:     XY
## Bounding box:  xmin: 82388.15 ymin: 444400.2 xmax: 85875.95 ymax: 447987.8
## Projected CRS: Amersfoort / RD New
##   total_length                       geometry
## 1 12778.84 [m] MULTILINESTRING ((85538.03 ...
nrow(pedestrian_Delft)
## [1] 234
ggplot() +
  geom_sf(data = pedestrian_Delft) +
  ggtitle("Slow mobility network", subtitle = "Pedestrian") + 
  coord_sf(datum = st_crs(28992))

Customize plots

Let’s say that we want to color different road types with different colors and that we want to determine those colors.

levels(factor(lines_Delft$highway))
##  [1] "bridleway"      "busway"         "construction"   "cycleway"      
##  [5] "footway"        "living_street"  "motorway"       "motorway_link" 
##  [9] "path"           "pedestrian"     "platform"       "primary"       
## [13] "primary_link"   "proposed"       "residential"    "secondary"     
## [17] "secondary_link" "service"        "services"       "steps"         
## [21] "tertiary"       "tertiary_link"  "track"          "trunk"         
## [25] "trunk_link"     "unclassified"

If we look at all the unique values of the higway field of our street network we see more than 20 values. I will focus on a subset of four values to illustrate the use of distinct colours. We use a piped expression in which we only filter the rows of our data frame that have one of the four given values "motorway", "primary", "secondary", and "cycleway". We also make sure that the highway column is a factor column.

road_types <- c("motorway", "primary", "secondary", "cycleway")

lines_Delft_selection <- lines_Delft %>% 
  filter(highway %in% road_types) %>% 
  mutate(highway = factor(highway, levels = road_types))

Next we define the four colors we want to use, one for each type of road in our vector object. Note that in R you can use named colors like "blue", "green", "navy", and "purple". A full list of named colors can be listed with the colors() function.

road_colors <- c("blue", "green", "navy", "purple")

We can use the defined color palette in ggplot.

ggplot(data = lines_Delft_selection) +
  geom_sf(aes(color = highway)) + 
  scale_color_manual(values = road_colors) +
  labs(color = 'Road Type') +
  ggtitle("Road network of Delft", subtitle = "Roads & Cycleways") + 
  coord_sf()

Adjust line width

Earlier we adjusted the line width universally. We can also adjust line widths for every factor level. Note that in this case the size argument, like the color argument, are within the aes() mapping function. This means that the values of that visual property will be mapped from a variable of the object that is being plotted.

line_widths <- c(1, 0.75, 0.5, 0.25)
ggplot(data = lines_Delft_selection) +
  geom_sf(aes(color = highway, size = highway)) +
  scale_color_manual(values = road_colors) +
  labs(color = 'Road Type') +
  scale_size_manual(values = line_widths) +
  ggtitle("Mobility network of Delft", subtitle = "Roads & Cycleways") + 
  coord_sf()

Challenge

In the example above, we set the line widths to be 1, 0.75, 0.5, and 0.25. In our case line thicknesses are consistent with the hierarchy of the selected road types, but in some cases we might want to show a different hierarchy.

Let’s create another plot where we show the different line types with the following thicknesses:

  • motorways size = 0.25
  • primary size = 0.75
  • secondary size = 0.5
  • cycleway size = 1
levels(factor(lines_Delft_selection$highway))
## [1] "motorway"  "primary"   "secondary" "cycleway"
line_width <- c(0.25, 0.75, 0.5, 1)
ggplot(data = lines_Delft_selection) +
  geom_sf(aes(size = highway)) +
  scale_size_manual(values = line_width) +
  ggtitle("Mobility network of Delft", subtitle = "Roads & Cycleways - Line width varies") + 
  coord_sf()

Add plot legend

Let’s add a legend to our plot. We will use the road_colors object that we created above to color the legend. We can customize the appearance of our legend by manually setting different parameters.

ggplot(data = lines_Delft_selection) + 
  geom_sf(aes(color = highway), size = 1.5) +
  scale_color_manual(values = road_colors) +
  labs(color = 'Road Type') + 
  ggtitle("Mobility network of Delft", 
          subtitle = "Roads & Cycleways - Default Legend") + 
  coord_sf()

ggplot(data = lines_Delft_selection) + 
  geom_sf(aes(color = highway), size = 1.5) +
  scale_color_manual(values = road_colors) + 
  labs(color = 'Road Type') +
  theme(legend.text = element_text(size = 20), 
        legend.box.background = element_rect(size = 1)) + 
  ggtitle("Mobility network of Delft", 
          subtitle = "Roads & Cycleways - Modified Legend") +
  coord_sf()

new_colors <- c("springgreen", "blue", "magenta", "orange")

ggplot(data = lines_Delft_selection) + 
  geom_sf(aes(color = highway), size = 1.5) + 
  scale_color_manual(values = new_colors) +
  labs(color = 'Road Type') +
  theme(legend.text = element_text(size = 20), 
        legend.box.background = element_rect(size = 1)) + 
  ggtitle("Mobility network of Delft", 
          subtitle = "Roads & Cycleways - Modified Legend") +
  coord_sf()

Challenge

Create a plot that emphasizes only roads where bicycles are allowed. To emphasize this, make the lines where bicycles are not allowed THINNER than the roads where bicycles are allowed. Be sure to add a title and legend to your map. You might consider a color palette that has all bike-friendly roads displayed in a bright color. All other lines can be black.

class(lines_Delft_selection$highway)
## [1] "factor"
levels(factor(lines_Delft$highway))
##  [1] "bridleway"      "busway"         "construction"   "cycleway"      
##  [5] "footway"        "living_street"  "motorway"       "motorway_link" 
##  [9] "path"           "pedestrian"     "platform"       "primary"       
## [13] "primary_link"   "proposed"       "residential"    "secondary"     
## [17] "secondary_link" "service"        "services"       "steps"         
## [21] "tertiary"       "tertiary_link"  "track"          "trunk"         
## [25] "trunk_link"     "unclassified"
# First, create a data frame with only those roads where bicycles are allowed
lines_Delft_bicycle <- lines_Delft %>% 
  filter(highway == "cycleway")

# Next, visualise using ggplot
ggplot(data = lines_Delft) +
  geom_sf() +
  geom_sf(data = lines_Delft_bicycle, aes(color = highway), size = 2) +
  scale_color_manual(values = "magenta") +
  ggtitle("Mobility network in Delft", subtitle = "Roads dedicated to Bikes") +
  coord_sf()

Create a map of the municipal boundaries in the Netherlands using the data located in your data folder: nl-gemeenten.shp. Apply a line color to each state using its region value. Add a legend.

municipal_boundaries_NL <- st_read(here("data", "nl-gemeenten.shp"))
## Reading layer `nl-gemeenten' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/nl-gemeenten.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 344 features and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 10425.16 ymin: 306846.2 xmax: 278026.1 ymax: 621876.3
## Projected CRS: Amersfoort / RD New
str(municipal_boundaries_NL)
## Classes 'sf' and 'data.frame':   344 obs. of  7 variables:
##  $ identifica: chr  "GM0014" "GM0034" "GM0037" "GM0047" ...
##  $ naam      : chr  "Groningen" "Almere" "Stadskanaal" "Veendam" ...
##  $ code      : chr  "0014" "0034" "0037" "0047" ...
##  $ ligtInProv: chr  "20" "24" "20" "20" ...
##  $ ligtInPr_1: chr  "Groningen" "Flevoland" "Groningen" "Groningen" ...
##  $ fuuid     : chr  "gemeentegebied.ee21436e-5a2d-4a8f-b2bf-113bddd028fc" "gemeentegebied.6e4378d7-0905-4dff-b351-57c1940c9c90" "gemeentegebied.515fbfe4-614e-463d-8b8c-91d35ca93b3b" "gemeentegebied.a3e71341-218c-44bf-ba12-01e2251ea2f6" ...
##  $ geometry  :sfc_MULTIPOLYGON of length 344; first list element: List of 1
##   ..$ :List of 1
##   .. ..$ : num [1:4749, 1:2] 238742 238741 238740 238738 238735 ...
##   ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA
##   ..- attr(*, "names")= chr [1:6] "identifica" "naam" "code" "ligtInProv" ...
levels(factor(municipal_boundaries_NL$ligtInPr_1))
##  [1] "Drenthe"       "Flevoland"     "Fryslân"       "Gelderland"   
##  [5] "Groningen"     "Limburg"       "Noord-Brabant" "Noord-Holland"
##  [9] "Overijssel"    "Utrecht"       "Zeeland"       "Zuid-Holland"
ggplot(data = municipal_boundaries_NL) +
  geom_sf(aes(color = ligtInPr_1), size = 1) +
  ggtitle("Contiguous NL Municipal Boundaries") + 
  coord_sf()

Plot multiple shapefiles (40 + 20 minutes - 15 minutes from plot costomisation)

So far we learned how to plot information from a single shapefile and do some plot customization. What if we want to create a more complex plot with many shapefiles and unique symbols that need to be represented clearly in a legend?

Now, let’s create a plot that combines our leisure locations (point_Delft), municipal boundary (boundary_Delft) and streets (lines_Delft) spatial objects. We will need to build a custom legend as well.

To begin, we will create a plot with the site boundary as the first layer. Then layer the leisure locations and street data on top using +.

ggplot() + 
  geom_sf(data = boundary_Delft, fill = "grey", color = "grey") +
  geom_sf(data = lines_Delft_selection, aes(color = highway), size = 1) +
  geom_sf(data = point_Delft) +
  ggtitle("Mobility network of Delft") + 
  coord_sf()

Now let’s create a custom legend.

leisure_colors <- rainbow(15)
point_Delft$leisure <- factor(point_Delft$leisure)
ggplot() + 
  geom_sf(data = boundary_Delft, fill = "grey", color = "grey") +
  geom_sf(data = lines_Delft_selection, aes(color = highway), size = 1) + 
  geom_sf(data = point_Delft, aes(fill = leisure), shape = 21) +
  scale_color_manual(values = road_colors, name = "Road Type") + 
  scale_fill_manual(values = leisure_colors, name = "Lesiure Location") + 
  ggtitle("Mobility network and leisure in Delft") + 
  coord_sf()

ggplot() +
  geom_sf(data = boundary_Delft, fill = "grey", color = "grey") +
  geom_sf(data = lines_Delft_selection, aes(color = highway), size = 1) +
  geom_sf(data = point_Delft, aes(fill = leisure), shape = 22) +
  scale_color_manual(values = road_colors, name = "Line Type") +
  scale_fill_manual(values = leisure_colors, name = "Leisure Location") +
  ggtitle("Mobility network and leisure in Delft") + 
  coord_sf()

We notice that there are quite some playgrounds in the residential parts of Delft, whereas on campus there is a concentration of picnic tables. So that is what our next challenge is about.

Challenge

Create a map of leisure locations only including playground and picnic_table, with each point colored by the leisure type. Overlay this layer on top of the lines_Delft layer (the streets). Create a custom legend that applies line symbols to lines and point symbols to the points.

Modify the plot above. Tell R to plot each point, using a different symbol of shape value.

leisure_locations_selection <- st_read(here("data", "delft-leisure.shp")) %>% 
  filter(leisure %in% c("playground", "picnic_table"))
## Reading layer `delft-leisure' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/delft-leisure.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 298 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 81863.21 ymin: 442621.1 xmax: 87370.15 ymax: 449345.1
## Projected CRS: Amersfoort / RD New
levels(factor(leisure_locations_selection$leisure))
## [1] "picnic_table" "playground"
blue_orange <- c("cornflowerblue", "darkorange")
ggplot() + 
  geom_sf(data = lines_Delft_selection, aes(color = highway)) + 
  geom_sf(data = leisure_locations_selection, aes(fill = leisure), 
          shape = 21, show.legend = 'point') + 
  scale_color_manual(name = "Line Type", values = road_colors,
     guide = guide_legend(override.aes = list(linetype = "solid", shape = NA))) + 
  scale_fill_manual(name = "Soil Type", values = blue_orange,
     guide = guide_legend(override.aes = list(linetype = "blank", shape = 21, colour = NA))) + 
  ggtitle("Traffic and leisure") + 
  coord_sf()

ggplot() + 
  geom_sf(data = lines_Delft_selection, aes(color = highway), size = 1) + 
  geom_sf(data = leisure_locations_selection, aes(fill = leisure, shape = leisure), size = 3) + 
  scale_shape_manual(name = "Leisure Type", values = c(21, 22)) +
  scale_color_manual(name = "Line Type", values = road_colors) + 
  scale_fill_manual(name = "Leisure Type", values = rainbow(15),
     guide = guide_legend(override.aes = list(linetype = "blank", shape = c(21, 22),
     color = "black"))) + 
  ggtitle("Road network and leisure") + 
  coord_sf()

Handling Spatial Projections and CRS (40 + 20 minutes)

Working with spatial data from different sources

municipal_boundary_NL <- st_read(here("data","nl-gemeenten.shp"))
## Reading layer `nl-gemeenten' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/nl-gemeenten.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 344 features and 6 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 10425.16 ymin: 306846.2 xmax: 278026.1 ymax: 621876.3
## Projected CRS: Amersfoort / RD New
ggplot() +
  geom_sf(data = municipal_boundary_NL) +
  ggtitle("Map of Contiguous NL Municipal Boundaries") +
  coord_sf()

We can add a country boundary layer to make it look nicer. If we specify a thicker line width using size = 2 for the country boundary layer, it will make our map pop!

country_boundary_NL <- st_read(here("data", "nl-boundary.shp"))
## Reading layer `nl-boundary' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/nl-boundary.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 10425.16 ymin: 306846.2 xmax: 278026.1 ymax: 621876.3
## Projected CRS: Amersfoort / RD New
ggplot() +
  geom_sf(data = country_boundary_NL, color = "gray18", size = 2) +
  geom_sf(data = municipal_boundary_NL, color = "gray40") +
  ggtitle("Map of Contiguous NL Municipal Boundaries") +
  coord_sf()

# st_crs(point_Delft)
st_crs(municipal_boundary_NL)
## Coordinate Reference System:
##   User input: Amersfoort / RD New 
##   wkt:
## PROJCRS["Amersfoort / RD New",
##     BASEGEOGCRS["Amersfoort",
##         DATUM["Amersfoort",
##             ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4289]],
##     CONVERSION["RD New",
##         METHOD["Oblique Stereographic",
##             ID["EPSG",9809]],
##         PARAMETER["Latitude of natural origin",52.1561605555556,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",5.38763888888889,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9999079,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",155000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",463000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
##         BBOX[50.75,3.2,53.7,7.22]],
##     ID["EPSG",28992]]
st_crs(country_boundary_NL)
## Coordinate Reference System:
##   User input: Amersfoort / RD New 
##   wkt:
## PROJCRS["Amersfoort / RD New",
##     BASEGEOGCRS["Amersfoort",
##         DATUM["Amersfoort",
##             ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4289]],
##     CONVERSION["RD New",
##         METHOD["Oblique Stereographic",
##             ID["EPSG",9809]],
##         PARAMETER["Latitude of natural origin",52.1561605555556,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",5.38763888888889,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9999079,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",155000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",463000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["easting (X)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["northing (Y)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["Netherlands - onshore, including Waddenzee, Dutch Wadden Islands and 12-mile offshore coastal zone."],
##         BBOX[50.75,3.2,53.7,7.22]],
##     ID["EPSG",28992]]
boundary_Delft <- st_read(here("data", "delft-boundary.shp"))
## Reading layer `delft-boundary' from data source 
##   `/Users/claudiuforgaci/Projects/geospatial-data-carpentry-tud-2022-11/data/delft-boundary.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 4.320218 ymin: 51.96632 xmax: 4.407911 ymax: 52.0326
## Geodetic CRS:  WGS 84
st_crs(boundary_Delft)
## Coordinate Reference System:
##   User input: WGS 84 
##   wkt:
## GEOGCRS["WGS 84",
##     DATUM["World Geodetic System 1984",
##         ELLIPSOID["WGS 84",6378137,298.257223563,
##             LENGTHUNIT["metre",1]]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433]],
##     CS[ellipsoidal,2],
##         AXIS["latitude",north,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433]],
##         AXIS["longitude",east,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433]],
##     ID["EPSG",4326]]
boundary_Delft <- st_transform(boundary_Delft, 28992)
ggplot() +
  geom_sf(data = country_boundary_NL, size = 2, color = "gray18") +
  geom_sf(data = municipal_boundary_NL, color = "gray40") +
  geom_sf(data = boundary_Delft, color = "purple", fill = "purple") +
  ggtitle("Map of Contiguous NL Municipal Boundaries") +
  coord_sf()

Challenge

Create a map of the South Holland as follows:

  1. Import and plot nl-gemeenten.shp. Adjust line width as necessary.
  2. Layer the boundary of Delft onto the plot.
  3. Add a title.
  4. Add a legend that shows both the privince boundaries (as a line) and the boundary of Delft (as a filled polygon).
boundary_ZH <- municipal_boundary_NL %>% 
  filter(ligtInPr_1 == "Zuid-Holland")
ggplot() +
    geom_sf(data = boundary_ZH, aes(color ="color"), show.legend = "line") +
    scale_color_manual(name = "", labels = "Municipal Boundaries", values = c("color" = "gray18")) +
    geom_sf(data = boundary_Delft, aes(shape = "shape"), color = "purple", fill = "purple") +
    scale_shape_manual(name = "", labels = "Municipality of Delft", values = c("shape" = 19)) +
    ggtitle("Delft location") +
    theme(legend.background = element_rect(color = NA)) +
    coord_sf()

Export a shapefile

To save a file, use the st_write() function from the sf package.

st_write(leisure_locations_selection,
         here("data_output","leisure_locations_selection.shp"), driver = "ESRI Shapefile")